PROCESS FOR MAKING GENES ENCODING RANDOM POLYMERS OF AMINO ACIDS



Background of the Invention 



Copolymer 1 (COP-1) is a synthetic polypeptide analog of myelin basic protein (MBP), which is a
natural component of the myelin sheath. It has been suggested as a potential therapeutic agent for
multiple sclerosis ([...] J. Neurol. [...] [1977] 31:433). Interest in COP-1 as an [...] observations
first made in the 1[...] that myelin components such as MBP prevent or arrest experimental autoimmune
encephalomyelitis (EAE). EAE is a disease resembling multiple sclerosis that can be induced in
susceptible animals.

COP-1 was developed by Drs. Sela, Arnon, and their co-workers at the Weizmann Institute (Rehovot,
Israel). It was shown to suppress experimental allergic encephalomyelitis (EAE) (Eur. J. Immunol. [1971]
1:242-248; U.S. Patent No. 3,849,550). More recently, COP-1 was shown to be beneficial for patients with
the exacerbating-remitting form of multiple sclerosis (N. Engl. J. Med. [1987] 317:408-414). Patients
treated with daily injections of COP-1 had fewer exacerbations and smaller increases in their disability
status than the control [...].

COP-1 is a mixture of polypeptides composed of alanine, glutamic acid, lysine, and tyrosine in a molar
ratio of approximately 6:2:5:1, respectively. It is synthesized by chemically polymerizing the four amino
acids, forming products with average molecular weights of 23,000 daltons. Although the resulting
polypeptides are comprised of the same amino acid components, they differ with respect to their amino
acid sequences. In fact, there are 10[...] possible ways to assemble a 23,000 dalton polypeptide composed
of alanine, glutamic acid, lysine, and tyrosine in the designated ratios. Purification of one or even a
small number of distinct COP-1 polypeptides from chemically-synthesized COP-1 is not possible.
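
The scale of this sequence diversity can be estimated with a short calculation. The sketch below is illustrative only: the residue masses are standard textbook values that do not appear in the source, the function names are hypothetical, and the composition is assumed to follow the 6:2:5:1 ratio exactly. It counts the distinct orderings of a chain of roughly 23,000 daltons as a multinomial coefficient and returns the result as a power of ten.

```python
from math import lgamma, log

# Approximate residue masses in daltons (standard textbook values; an
# assumption, since the source does not state them).
RESIDUE_MASS = {"Ala": 71.08, "Glu": 129.12, "Lys": 128.17, "Tyr": 163.18}
# Molar ratio Ala:Glu:Lys:Tyr as given in the text.
MOLAR_RATIO = {"Ala": 6, "Glu": 2, "Lys": 5, "Tyr": 1}


def chain_length(target_mass=23_000):
    """Estimate the number of residues in a chain of the target mass."""
    parts = sum(MOLAR_RATIO.values())
    mean_mass = sum(RESIDUE_MASS[a] * n for a, n in MOLAR_RATIO.items()) / parts
    return round(target_mass / mean_mass)


def log10_arrangements(length):
    """log10 of the multinomial count of orderings for the given chain length."""
    parts = sum(MOLAR_RATIO.values())
    counts = [length * n / parts for n in MOLAR_RATIO.values()]
    # lgamma(x + 1) equals ln(x!) and tolerates non-integer residue counts.
    ln_ways = lgamma(length + 1) - sum(lgamma(c + 1) for c in counts)
    return ln_ways / log(10)


if __name__ == "__main__":
    n = chain_length()
    print(f"~{n} residues, about 10^{log10_arrangements(n):.0f} orderings")
```

The estimate ignores the spread of chain lengths and compositions in a real batch, so it should be read as an order-of-magnitude figure rather than a reproduction of the number quoted in the text.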

Studies evaluating COP-1's efficacy have been hindered somewhat by inconsistent batches of COP-1.
Also, it is not known which of the amino acid polymer(s) is responsible for the [...].
Other random sequence amino acid copolymers related to COP-1 have been chemically synthesized and
tested for their ability to suppress experimental allergic encephalomyelitis (Eur. J. Immunol. [1973]
3:273; Immunochemistry [1976] 13:333). Biological activity was observed [...] polymers in which one of
the following changes occurs: tyrosine is replaced by tryptophan; or glutamic acid is replaced by
aspartic acid; or tyrosine is excluded.

We have developed procedures for synthesizing genes encoding polypeptides composed of specific
amino acids but having random amino acid sequences. The amino acid composition of the polypeptides is
determined by the set of codons incorporated in the synthetic genes. Likewise, the size of the
polypeptides is controlled by synthesizing genes of specific lengths.



Brief Summary of the Invention 



The subject invention concerns a method for synthesizing genes which encode random polymers of
amino acids. A further aspect of the invention is the identification of certain polypeptides which are
expressed by the synthetic genes and which have high levels of biological activity. The general
methodology of the subject invention is outlined in Figure 1.

An important step in the novel process of the subject invention is the polymerization of small
oligonucleotide duplexes. Preferably, the oligonucleotide duplexes consist of a multiple of three
nucleotides. The length of the synthetic genes can be controlled through the use of adaptors specific
for the 5' and 3' ends of the oligonucleotides. Further, the composition of the resultant polypeptides
can be varied with respect to the relative proportions of the amino acid constituents, by varying the
proportions of input [...].

The synthesis of polypeptides similar to COP-1 exemplifies the procedures of the subject invention. The
initial step of the procedure is the synthesis of genes which code for polypeptides consisting of
predetermined [...]. The genes are then cloned in an expression vector and introduced into E. coli [...].
[...] polypeptides (analogous to the chemically synthesized product) we produce COP-1 polypeptides from a
pool of recombinant bacterial colonies containing COP-1 gene sequences, e.g., 1000 colonies.

The efficacy of the pool of recombinant COP-1 polypeptides is tested in experimental allergic
encephalomyelitis assays. If effective, the pool of colonies exhibiting activity is further subdivided
(e.g., pools of 100 colonies) and [...] individual recombinant COP-1 polypeptides or groups of [...]
activity in the assays equal to or higher than chemically synthesized COP-1. The opportunity to
characterize homogeneous, individual COP-1 polypeptides is unique [...] synthesized using [...].
Specifically, a preferred copolymer may consist of alanine, lysine, [...] weight between about 5,000 and
50,000 daltons. Further, the method of the subject invention can also be used to make fusion proteins.
Brief Description of the Drawings



Figure 1 depicts the general methodology for synthesizing random genes and identifying polypeptides.

Figure 2 shows a specific strategy for synthesizing genes encoding random-sequence polypeptides.

Figure 3 shows all possible 3-amino acid combinations and their percent occurrence in COP-1.

Figure 4 shows 9-nucleotide duplexes and adaptors for COP-1 gene synthesis.

Figure 5 shows the construction and cloning of synthetic COP-1 genes.

Figure 6 provides the sequence analysis of one synthetic gene revealing proper junctions between
duplexes.

Figure 7 is a western blot showing fusion protein produced by four different clones.

Figure 8 shows variations of random-gene synthesis using small duplexes.

Figure 9 shows synthesis of single-stranded DNA encoding random polymers.

Figure 10 shows a phosphoramidite trinucleotide for random gene synthesis using a DNA synthesizer.

Figure 11 shows the DNA and amino acid sequences of rCOP-1-77.

Figure 12 shows the DNA and amino acid sequences of rCOP-1-19.

Figure 13 shows average EAE scores of disease-induced guinea pigs which were untreated or
treated with myelin basic protein, rCOP-1-19, or rCOP-1-77.

Detailed Description of the Invention 






EP 0 383 620 A2 



To achieve this ordering, the sticky ends at each junction must be unique. Thus, the process of the
subject invention is unique in that random-sequence genes are synthesized using oligonucleotide duplexes
encoding small segments of amino acids, and the sticky ends on each duplex are the same.

The procedure for synthesizing genes encoding recombinant COP-1 polypeptides entails using
oligonucleotide duplexes encoding segments of 3 amino acids. All possible permutations of 3 amino acid
segments comprised of the four COP-1 amino acids and their percent occurrences in COP-1 are [...].
[...] recombinant COP-1 genes we have synthesized oligonucleotides corresponding to the coding and
noncoding strands for some of the 3 amino acid segments (Figure 4). The oligonucleotides are
phosphorylated at the 5' ends. Complementary pairs of oligonucleotides are annealed, forming duplexes
with the 3' nucleotide extending on each strand: adenosine on the coding strand, and thymidine on the
noncoding strand. Adenosine and thymidine base pair with one another, thus ensuring that the duplexes
align directionally, that is, coding strands to other coding strands. Since the duplexes have the same
nucleotide extensions, they can align in any order. When duplexes corresponding to all sixty-four
3-amino acid blocks are mixed and ligated, COP-1 genes with all possible sequences are produced.

To control the length of the synthetic genes, we have included adaptors specific for the 5' and 3' ends.
One strand of each duplex adaptor is not phosphorylated (see Figure 4). As a result, ligation products
[...] hydroxylated ends. By varying the ratio of the adaptor duplexes in the reaction, we can [...]
length of the synthetic genes. The adaptors serve the second function of adding specific extensions
required to directly clone the ligation products into the vector [...]. [...] amino acid constituents in
the random polymers can also be modulated by mixing the [...] dictated by the percent occurrence of the
corresponding amino acid segments in COP-1 (Figure 3).
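
The percent occurrences tabulated in Figure 3 follow from the 6:2:5:1 molar ratio if each position in a 3-amino acid segment is treated as independent. The sketch below, offered as an illustration rather than the authors' procedure, recomputes the table that way. The function names are hypothetical, and the renormalisation in `mixing_fractions` is an assumption about how a subset of duplexes might be mixed, not something the text specifies.

```python
from itertools import product

# Molar ratio Ala:Glu:Lys:Tyr from the text, in one-letter code.
MOLAR_RATIO = {"A": 6, "E": 2, "K": 5, "Y": 1}


def segment_percentages(ratio=MOLAR_RATIO, block=3):
    """Percent occurrence of every amino acid segment of length `block`."""
    total = sum(ratio.values())
    table = {}
    for combo in product(ratio, repeat=block):
        probability = 1.0
        for residue in combo:
            probability *= ratio[residue] / total
        table["".join(combo)] = 100 * probability
    return table


def mixing_fractions(segments, table):
    """Relative input amounts when only some duplexes are used (assumed rule)."""
    subtotal = sum(table[s] for s in segments)
    return {s: table[s] / subtotal for s in segments}


if __name__ == "__main__":
    table = segment_percentages()
    for name in ("AAA", "AEA", "KKK", "YYY"):
        print(name, round(table[name], 3))
```

For example, `segment_percentages()["AAA"]` is (6/14)^3 expressed as a percentage, which matches the first entry of Figure 3.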

The number of different duplexes incorporated determines the sequence complexity of the synthetic
genes. For example, COP-1 genes can be constructed using fewer than 64 duplexes, but not all sequence
combinations will occur. The amount of sequence complexity required will depend on the application of
the [...].

A complication arises in producing completely random COP-1 amino acid sequences. For [...] to have the
same extensions, the 3' nucleotide on the coding strands must be the same. Codons for alanine, glutamic
acid, and lysine can end with adenosine; however, for tyrosine, the last nucleotide is either cytidine or
thymidine. Thus, duplexes encoding three amino acid segments ending in tyrosine (fourth column of
Figure 3) will have different extensions than the other duplexes. This limitation can be overcome by
[...] extension of guanosine or adenosine. Because [...] DNA synthesis, we have elected to exclude
duplexes corresponding to [...].
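
The codon constraint described above can be checked directly against the standard genetic code. The following sketch is only an illustration: the helper names are hypothetical, and the count it produces is simply the number of 3-amino acid segments whose final codon can end in the shared extension nucleotide.

```python
from itertools import product

# Standard genetic code entries for the four COP-1 amino acids (DNA alphabet).
CODONS = {
    "A": ["GCT", "GCC", "GCA", "GCG"],  # alanine
    "E": ["GAA", "GAG"],                # glutamic acid
    "K": ["AAA", "AAG"],                # lysine
    "Y": ["TAT", "TAC"],                # tyrosine
}


def can_end_with(amino_acid, nucleotide):
    """True if some codon for the amino acid ends in the given nucleotide."""
    return any(codon.endswith(nucleotide) for codon in CODONS[amino_acid])


def usable_segments(extension="A"):
    """Segments whose last codon can supply the common 3' extension."""
    return [
        "".join(segment)
        for segment in product(CODONS, repeat=3)
        if can_end_with(segment[-1], extension)
    ]


if __name__ == "__main__":
    print({aa: can_end_with(aa, "A") for aa in CODONS})
    print(len(usable_segments("A")), "of 64 segments can end in adenosine")
```

Under this check every segment ending in tyrosine is rejected, which corresponds to the fourth column of Figure 3.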

[...] cloned into a suitable expression vector and transferred to a host capable of expressing the
polypeptides. The host may be bacterial or eukaryotic cells. With bacteria, cells are grown under
conditions [...] formation of recombinant colonies. Each colony will contain [...]. [...] expressed by
culture from specific colonies are isolated [...] relevant [...]. For example, COP-1 polypeptides are
tested for [...].

To [...] of polypeptides for activity, it may be advantageous to pool the polypeptides before [...].
Pools of polypeptides can be generated by either combining colonies and isolating the polypeptides [...]
individual [...] isolating the polypeptides from each culture and [...] the purified polypeptides. By
testing pools of polypeptides, it is [...] biologically or immunologically active polypeptides. For
example, if polypeptides from 100 colonies do not exhibit activity, these colonies can be [...] further
investigation. If, on the other hand, the mixture of polypeptides does exhibit the [...] of the pooled
polypeptides can be tested until the active [...].

[...] polypeptides. For example, reactivity of polypeptides with antisera can be assessed in
colonies grown and lysed on nitrocellulose filters.
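
The pooling strategy can be pictured as a recursive search. The sketch below is a simplified illustration under stated assumptions: `is_active_pool` stands in for a biological assay and is hypothetical, a pool is assumed to score as active whenever it contains at least one active member (dilution and interference are ignored), and the default of ten sub-pools mirrors the example of dividing 1000 colonies into pools of 100.

```python
def find_active(colonies, is_active_pool, n_subpools=10):
    """Return the colonies whose polypeptides are active, testing pools first.

    colonies       -- list of colony identifiers
    is_active_pool -- callable standing in for the assay on a pooled sample
    n_subpools     -- how many smaller pools each active pool is split into
    """
    if n_subpools < 2:
        raise ValueError("n_subpools must be at least 2")
    if not colonies or not is_active_pool(colonies):
        return []  # an inactive pool eliminates all of its members at once
    if len(colonies) == 1:
        return list(colonies)
    size = -(-len(colonies) // n_subpools)  # ceiling division
    hits = []
    for start in range(0, len(colonies), size):
        hits.extend(find_active(colonies[start:start + size], is_active_pool, n_subpools))
    return hits


if __name__ == "__main__":
    active = {137, 902}  # made-up colony numbers for the demonstration
    assays = []

    def assay(pool):
        assays.append(len(pool))
        return any(colony in active for colony in pool)

    print(find_active(list(range(1000)), assay), "found with", len(assays), "assays")
```

The appeal of the approach is that the number of assays grows with the number of active colonies rather than with the size of the library.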

Materials and Methods

Synthesis and Phosphorylation of Oligonucleotides. The coding and noncoding oligonucleotides for
[...] are synthesized by the phosphite triester method with an Applied Biosystems Model [...] DNA
synthesizer. The 5' ends of oligonucleotides are phosphorylated on the DNA synthesizer using
2-[2-(4,4'-[...] phosphoramidite from Glen Research, [...].

Purification of Oligonucleotides. The phosphorylated form of the oligonucleotide is separated from the
crude mixture by electrophoresis through a 20% acrylamide gel containing 7 M urea. Oligomers are eluted
[...] and desalted on Sep-Pak C-18 cartridges. Separation of the 5' phosphorylated [...] because
hydroxylated oligomers [...] ligation reactions to terminate prematurely.

Annealing and Ligation of Oligonucleotides. [...] nmol each [...] 6 mM ethylenediaminetetraacetic acid
[...] cloning vector. To obtain a maximum yield of synthetic genes within the correct [...] pmol of each
adaptor [...]. Ligation reactions are [...] 500 units of T4 DNA ligase (New England Biolabs, Beverly,
[...]) [...] polyethylene glycol is removed by [...] are concentrated by ethanol precipitation and [...].
The agarose plug containing the synthetic DNA is stored at -20°C.

Preparation of Expression Vector

[...] pREV 2.1 [...] can be constructed from a plasmid pBG1. Plasmid pBG1 can be [...] pREV 2.2. Like
pBG1, pREV2.2 expresses inserted genes behind the E. coli promoter. [...] between pBG1 and pREV2.2 are
the following:

1. pREV2.2 lacks a functional replication of plasmid (rop) protein.

2. pREV2.2 has the trpA transcription terminator inserted into the AatII site. This sequence insures
transcription termination [...].

3. [...] to chloramphenicol, whereas pBG1 [...].

[...] standard procedures.

1c. The product plasmid, pBG1ΔN, where the 2160 base pair NdeI fragment is deleted from pBG1,
was selected by preparing plasmid from ampicillin resistant clones and determining the restriction
digestion patterns with NdeI and SalI (product fragments approximately 1790 and 1650). This deletion
inactivates the rop gene that controls plasmid replication.

2a. 5 ug of pBG1ΔN was then digested with EcoRI and BclI and the larger fragment, approximately
2455 base pairs, was isolated.

2b. A synthetic double stranded fragment was prepared by the procedure of Itakura et al. (Itakura, K.,
J.J. Rossi, and R.B. Wallace [1984] Ann. Rev. Biochem. 53:323-356, and references therein) with the
following structure:

5' GATCAAGCTTCTGCAGTCGACGCATGCGGATCCGGTACCCGGGAGCTCG     3'
3'     TTCGAAGACGTCAGCTGCGTACGCCTAGGCCATGGGCCCTCGAGCTTAA 5'

This fragment has BclI and EcoRI sticky ends and contains recognition sequences for several restriction
endonucleases.

2c. [...] ug of the 2455 base pair EcoRI-BclI fragment and 0.01 ug of the synthetic fragment were
joined with T4 DNA ligase and competent cells of strain JM103 were transformed. Cells harboring the
recombinant plasmid, where the synthetic fragment was inserted into pBG1ΔN between the BclI and EcoRI
sites, were selected by digestion of the plasmid with HpaI and EcoRI. The diagnostic fragment sizes are
approximately 2355 and 200 base pairs. This plasmid is called pREV1.

2d. 5 ug of pREV1 were digested with AatII, which cleaves uniquely.

2e. The following double-stranded fragment was synthesized:

5'     CGGTACCAGCCCGCCTAATGAGCGGGCTTTTTTTTGACGT 3'
3' TGCAGCCATGGTCGGGCGGATTACTCGCCCGAAAAAAAAC     5'

This fragment has AatII sticky ends and contains the trpA transcription termination sequence.

2f. 0.1 ug of AatII digested pREV1 was ligated with 0.01 ug of the synthetic fragment in a volume of
20 ul [...].

2g. Cells of strain JM103, made competent, were transformed and ampicillin resistant clones
selected.

2h. [...] restriction digest of plasmid isolated from selected colonies, a cell containing the correct
construction was isolated. The sizes of the KpnI, EcoRI [...] approximately 2475 and [...] base pairs.
This plasmid is called pREV1TT and contains the trpA transcription terminator.

3a. [...] pREV1TT, prepared as disclosed above (by standard methods), was cleaved with NdeI
and XmnI and the approximately 850 base pair fragment was isolated.

3b. 5 ug of plasmid pBR325 (BRL, Gaithersburg, MD), which contains the genes conferring
resistance to chloramphenicol as well as to ampicillin and tetracycline, was cleaved with BclI and the
ends blunted with Klenow polymerase and deoxynucleotides. After inactivating the enzyme, the mixture was
treated with NdeI and the approximately 3185 base pair fragment was isolated. This fragment contains the
genes for chloramphenicol and ampicillin resistance and the origin of replication.

3c. 0.1 ug of the NdeI-XmnI fragment from pREV1TT and the NdeI-BclI fragment from pBR325 were
ligated in 20 ul with T4 DNA ligase and the mixture used to transform competent cells of strain JM103.
Cells resistant to both ampicillin and chloramphenicol were selected.

3d. [...] EcoRI and NdeI double digest of plasmid from selected clones, a plasmid [...] giving
fragment sizes of approximately 2480, 1145, and 410 base pairs. This is called plasmid pREV1TT/chl
and has genes for resistance to both ampicillin and chloramphenicol.

4a. The following double-stranded fragment was synthesized:
      MluI      EcoRV ClaI  BamHI
5' CGAACGCGTGGCCGATATCATCGATGG
3' GCTTGCGCACCGGCTATAGTAGCTACC
       SalI  HindIII SmaI
   ATCCGTCGACAAGCTTCCCGGGAGCT 3'
   TAGGCAGCTGTTCGAAGGGCCC     5'

This fragment, with a blunt end and an SstI sticky end, contains recognition sequences for several
restriction [...].

4b. [...] of pREV1TT/chl was cleaved with NruI (which cleaves about 20 nucleotides from the BclI
site) and SstI (which cleaves within the multiple cloning site). The larger fragment, approximately 3990
base pairs, [...].

4c. [...] fragment from pREV1TT/chl and 0.01 ug of the synthetic fragment were treated with T4 DNA
ligase in a volume of 20 ul.

4d. This mixture was transformed into strain JM103 and ampicillin resistant clones were selected.

4e. Plasmid was purified from several clones and screened by digestion with MluI or ClaI.
Recombinant clones with the new multiple cloning site will give one fragment when digested with either
of [...].

[...] verified. This was [...] with HpaI and [...] and isolating the 1395 base pair fragment, cloning it
into the SmaI site of mp18 and sequencing it by dideoxynucleotide sequencing using standard methods.

[...] its E. coli host by well known procedures. This plasmid was deposited in the E. coli host JM103
with the Northern Regional Research Laboratory (NRRL, U.S. Department of Agriculture, Peoria, Illinois,
USA) on July 20, 1986 and was assigned accession number NRRL [...].
Plasmid pREV 2.1 was constructed using plasmid pREV 2.2 and a synthetic oligonucleotide. An
example [...]:

1. [...] restriction enzymes NruI and BamHI and the 4 Kb fragment is isolated from an agarose gel.

2. The following double-strand oligonucleotide is synthesized:

5' CGAACGCGTGGTCCGATATCATCGATG 3'
3' GCTTGCGCACCAGGCTATAGTAGCTACCTAG 5'

3. The fragments from 1 and 2 are ligated in 20 ul using T4 DNA ligase, transformed into competent
[...] cells and chloramphenicol resistant colonies are isolated.

4. Plasmid clones are identified that contain the oligonucleotide from 2, spanning the region from the
[...] to the BamHI site and recreating these two restriction sites. This plasmid is termed pREV2.1.
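
Synthetic duplexes such as the oligonucleotide in step 2 can be checked for correct base pairing and for the expected single-stranded ends before they are ordered or used. The sketch below is a minimal illustration; the function name and the `shift` convention are hypothetical, and the strands are entered exactly as printed, with the top strand 5' to 3' and the bottom strand 3' to 5'.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def check_duplex(top, bottom, shift=0):
    """Check pairing of two strands written one above the other.

    top    -- upper strand, written 5'->3'
    bottom -- lower strand, written 3'->5' (as printed beneath the top strand)
    shift  -- columns by which the bottom strand starts to the right of the
              top strand (negative if it starts to the left)
    Returns (paired, left_overhang, right_overhang), overhangs as printed.
    """
    start = max(0, shift)
    end = min(len(top), shift + len(bottom))
    paired = top[start:end].translate(COMPLEMENT) == bottom[start - shift:end - shift]
    if shift > 0:
        left = top[:shift]
    elif shift < 0:
        left = bottom[:-shift]
    else:
        left = ""
    right = top[end:] if len(top) > end else bottom[end - shift:]
    return paired, left, right


if __name__ == "__main__":
    top = "CGAACGCGTGGTCCGATATCATCGATG"
    bottom = "GCTTGCGCACCAGGCTATAGTAGCTACCTAG"
    paired, left, right = check_duplex(top, bottom)
    # The right overhang is printed 3'->5'; reverse it to read 5'->3'.
    print(paired, repr(left), right[::-1])
```

For the step 2 oligonucleotide the left end is blunt and the right end carries a four-nucleotide 5' extension, which is consistent with joining a blunt NruI end to a BamHI sticky end.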

Following are examples which illustrate procedures, including the best mode, for practicing the
invention. These examples should not be construed as limiting. All percentages are by weight and all
solvent mixture proportions are by volume unless otherwise noted.
Example 1 - Strategy for Synthesis of Random Sequence Genes

Model studies using several oligonucleotide duplexes were performed to assess this method for
synthesizing genes encoding polypeptides of predetermined amino acid composition but random
sequences. The synthetic genes were analyzed with respect to size, ligation junctions, composition,
sequence, and expression.

Size. Synthetic genes within broad size ranges are produced by varying the ratio of adaptors to 9-mer
duplexes. We are able to select genes within more limited size ranges by resolving the ligation products
on agarose gels and excising gel slices containing products of a certain length. Using these procedures,
genes in the following three size ranges have been isolated: 75-150, 280-320, and 400-600 nucleotides.
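
The effect of the adaptor-to-duplex ratio on gene length can be illustrated with a toy simulation. The model below is an assumption made for illustration and is not taken from the source: each ligation event adds either a 9-mer duplex or a chain-terminating adaptor, with the adaptor chosen with probability equal to its fraction of the input molecules. Real ligation involves two distinct end adaptors and other kinetic effects that this sketch ignores.

```python
import random


def simulate_gene_lengths(adaptor_fraction, n_genes=10_000, duplex_len=9, seed=0):
    """Simulate coding-region lengths (nucleotides) under a simple capping model.

    adaptor_fraction -- assumed probability that the next molecule ligated is an
                        adaptor, which terminates further growth of the chain
    """
    if not 0 < adaptor_fraction <= 1:
        raise ValueError("adaptor_fraction must be in (0, 1]")
    rng = random.Random(seed)
    lengths = []
    for _ in range(n_genes):
        n_duplexes = 0
        while rng.random() >= adaptor_fraction:  # another 9-mer duplex is added
            n_duplexes += 1
        lengths.append(n_duplexes * duplex_len)
    return lengths


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    for fraction in (0.10, 0.05, 0.02):
        print(fraction, round(mean(simulate_gene_lengths(fraction))))
```

In this model the mean length is 9(1 - f)/f nucleotides for adaptor fraction f, so lowering the proportion of adaptor shifts the population toward longer genes, while the broad spread of lengths is why gel fractionation is still needed to obtain narrow size ranges.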

Ligation junctions. The synthetic genes have been sequenced to demonstrate that the 9-mer duplexes
are [...] and without insertion or deletion of any nucleotides. Correct junctions are necessary
for maintaining the reading frame and thus producing genes that encode polypeptides of the expected
amino acid composition. Sequence analysis revealed that the junctions between the duplexes are correct
[...].

Composition. To control the amino acid composition of the encoded polypeptides, the synthetic genes
must contain [...] the duplexes added to the ligation reactions. The results from three gene synthesis
experiments demonstrated that the synthesized genes are composed of the duplexes included in the input
[...].

Sequence. Since each duplex can ligate to any other, the order of the 9-mer duplexes in the synthetic
genes should be random. This randomness is demonstrated in the example shown in Figure 6. The
[...] gene shown in this example is composed of duplexes encoding the following amino acid [...].

Expression. To measure expression levels, synthetic genes are cloned into vectors such that the
polypeptides are expressed either as fusions with heterologous vector-derived peptide sequences, or as
nonfusion polypeptides. The levels of expression of the fusion products can be readily measured by
Western blot analysis using antisera directed against the vector-derived portion of the fusion protein.
Figure 7 demonstrates the expression of four COP-1-containing fusion polypeptides.



Example 2 - Variations for Synthesizing Random Sequence Genes with Small DNA Duplexes 

DNA duplexes of other lengths can be used in an approach similar to that described for duplexes of 9
nucleotides. Since three nucleotides code for one amino acid, strands of 3, 6, 12, 15, or 18 nucleotides
can be annealed, forming duplexes that code for small blocks of amino acids (see Figure 8). More sequence
variation occurs by mixing small rather than large duplexes.

In addition, terminal extensions of more than one nucleotide can be employed. Figure 8 shows an
example using duplexes of 12 nucleotides and extensions of 3 nucleotides. [...] codons in the duplexes
are varied to produce polypeptides of the desired amino acid composition. This approach restricts the
polypeptide sequences since the amino acid encoded by the [...] junction, alanine (Ala) in this example,
is repeated every fourth amino acid. Also, in another variation of the subject invention, extensions of
5' nucleotides can be employed instead of the 3' extensions illustrated throughout this report.
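
The periodicity imposed by a 3-nucleotide extension can be seen in a small model. The sketch below makes several assumptions for illustration: each 12-nucleotide coding block is taken to be three freely varied codons followed by a fixed junction codon, the junction codon is taken to be GCA (alanine), and one representative codon is used per amino acid. None of these specific sequences is given in the text.

```python
import random

# One representative codon per amino acid (an illustrative choice).
CODON = {"A": "GCA", "E": "GAA", "K": "AAA", "Y": "TAT"}
AMINO_ACID = {codon: aa for aa, codon in CODON.items()}


def random_block(rng, junction="GCA"):
    """A 12-nucleotide coding block: three variable codons plus the junction codon."""
    variable = "".join(CODON[rng.choice("AEKY")] for _ in range(3))
    return variable + junction


def build_gene(n_blocks, seed=0):
    """Concatenate randomly chosen blocks, as if ligated in random order."""
    rng = random.Random(seed)
    return "".join(random_block(rng) for _ in range(n_blocks))


def translate(gene):
    """Translate a gene made only of the codons in CODON."""
    return "".join(AMINO_ACID[gene[i:i + 3]] for i in range(0, len(gene), 3))


if __name__ == "__main__":
    print(translate(build_gene(5)))
```

Whatever the random choices, every fourth residue of the translated product is alanine, which is the sequence restriction noted above.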

Example 3 - Synthesis of Single-Stranded Random Sequence Genes 

Gene synthesis using DNA duplexes results in double stranded genes that are ready for cloning into an
expression vector. An alternative strategy entails producing single-stranded, random sequence genes. The
application of this method to COP-1 is illustrated in Figure 9. Three-nucleotide oligomers corresponding
to codons for each of the amino acids in COP-1 are synthesized, mixed in appropriate ratios, and
chemically polymerized in solution to produce long single-stranded COP-1 genes. The [...] is made
enzymatically using reverse transcriptase or DNA polymerase. The double-stranded DNA is prepared for
cloning by digestion and repairing the ends of the molecules.

Single-stranded random sequence genes could also be made by performing the polymerization [...] a DNA
synthesizer. Typically, synthetic DNA is assembled one nucleotide at a time using phosphoramidite
nucleotide precursors. We developed a strategy for synthesizing single-stranded DNA in three nucleotide
segments ("codons") by using phosphoramidite trinucleotides (Figure 10). The use of 3-nucleotide building
blocks instead of single nucleotides is necessary to ensure that only specific codons, those
corresponding [...], would [...] in synthetic genes. To test this strategy, we [...] a phosphoramidite
trinucleotide. We observed polymerization of the trinucleotide on the DNA synthesizer.

Example 4 - Expression of Random Sequence Genes in Vector Producing Fusion Proteins

Synthetic random sequence genes can be cloned into gene fusion vectors so that the expressed
polypeptides are comprised of [...] polypeptide linked to the random sequence polypeptide. The
application [...] here. Synthetic COP-1 genes are cloned into the [...] polylinker site. Upon expression
from pREV 2.1, a [...] portion (approximately 25 to approximately 45 [...]) [...] bacterial peptide alone
[...] polypeptides comprised of 34 amino acids of the vector-derived [...]. Clone 4 produces lower [...].

[...] other fusion vectors can be employed to express COP-1 fusion polypeptides [...] employed to release
the [...]. For example, to improve the expression of rCOP-1 polypeptides, [...] rCOP-1-19, were subcloned
from [...] pBG3-2ΔN [...] Patent No. 4,6[...],009. The deposit was made on November 20, [...]. [...]
plasmids encode fusion proteins consisting of [...] from the 5' linker [...] behind following CNBr
cleavage of the fusion protein.

The invention should not be limited to the examples described above.

Example 5 - Expression of Random Sequence Genes in Non-Fusion Vectors

Synthetic genes can be cloned into expression vectors such that the polypeptide products are not fused
to vector-derived protein sequences. As in fusion vectors, these non-fusion vectors contain the
appropriate [...] signals; however, the synthetic genes [...] linked directly adjacent to the [...]
initiation signal such that they contain a methionine residue as the amino-terminal amino acid.
[...] can be removed from COP-1 polypeptides by cyanogen bromide cleavage. COP-1 polypeptides with and
without this amino-terminal methionine are tested for biological activity.
Example 6 - Purification of rCOP-1 Polypeptides

The purification of rCOP-1 polypeptides can be accomplished by a number of methods which are well
known to those skilled in this art. For example, E. coli cells expressing Protein A/rCOP-1 fusion
proteins are [...] extract [...] centrifuged to remove debris [...] 8 M urea, and chromatography on an
S-Sepharose column using a sodium chloride [...]. [...] fusion protein are dialyzed against a solution
[...] saline; the dialysate is centrifuged to remove contaminating proteins that [...].

[...] rCOP-1 polypeptide is purified by gel filtration and reverse phase HPLC.



Example 7 - EAE Experiments

rCOP-1 has been tested for efficacy in suppressing experimental allergic encephalomyelitis (EAE). As
[...], EAE is a T-cell mediated autoimmune disease that is employed as a model for the human [...]
(Swanborg, R.[...] [...]). [...] disease is induced in Hartley guinea pigs by a single [...] bladder or
bowel; and 4 = extensive paralysis, inability to move. Animals are scored every 2-3 days from [...]
spontaneously recover from the disease. The treatment protocol [...] of [...]00 mg of test material at
1, 6, and 11 [...].

The dosage, route of administration, and schedule for treatments can be varied. Also, EAE [...] in other
species including rats and mice. Other variations of the experimental [...]. Parameters of disease, such
as incidence, day of onset, maximum severity, and duration, are compared.



Table 1. Effects of rCOP-1 and MBP on EAE

Group                  Incidence(1)   Day of Onset(2)       Max. Severity(2)   Duration(2)
Untreated              7/7            13.5 ± 1.0 (13-15)    2.3 ± 0.6 (1-3)    10.8 ± 3.8 (4-15)
rCOP-1-19              8/8            16.8 ± 3.7 (13-25)    2.8 ± 1.4 (1-4)    8.6(3) ± 5.2 (2-15)
rCOP-1-77              7/8            17.6 ± 2.8 (15-22)    2.9 ± 1.0 (1-4)    9.9 ± 4.5 (2-15)
Myelin Basic Protein   8/8            18.1 ± 2.6 (15-20)    2.3 ± 1.1 (1-4)    5.3 ± 5.5 (2-15)

(1) Incidence = number [...].
(2) Values are the mean ± standard deviation. Ranges are in parentheses.
(3) One animal in this group died.




The results can be [...] as [...]. Using the treatment [...] onset and decreased [...] disease
incidence. [...] severity or incidence. One note about this particular [...] duration of [...] untreated
animals (2-3) is unusually low; in a [...] scores were [...]. It may be possible to optimize the
effects of rCOP-1 by varying the treatment procedure.

Example 8 - Other Applications for Random Sequence Polymers of Amino Acids

[...] encoding random sequence [...] using the procedures described [...] amino acid composition and
length. [...] can be anticipated. Random sequence polypeptides of [...] composition may be useful
additives to hair care products. One common type of [...] of disulfide bonds and the subsequent
oxidation of cystine residues [...] changes that produce undesirable effects on hair. Random sequence
polypeptides may be able [...] and neutralize these effects. Polypeptides of different lengths [...]
amino acid [...] will have different physical properties such as charge, solubility, and ability to
[...] molecules may confer beneficial effects on damaged hair such as [...] combing the hair. The effects
may vary depending on the [...]. For example, a [...] neutralize the negative charge of [...].

[...] polypeptides having a predetermined amino acid composition is as supplements for diets deficient
in certain amino acids.

Claims

[...] duplex so that the duplexes can align and ligate [...] constituents [...] said [...].

[...] A method, according to claim 1, wherein said [...] tyrosine or tryptophan.

[...] 5,000 and about 50,000 daltons [...] for a polypeptide which is [...].

14. [...] predetermined amino acid constituents, said method comprising the synthesis of genes encoding
said random polymers, said synthesis comprising the polymerization of small oligonucleotide duplexes,
said method for making polypeptides further comprising the cloning of said synthetic genes into an
expression vector and transferring said vector into a bacteria or eukaryotic cell capable of expressing
said polypeptide.

15. A method, according to claim 14, wherein said bacteria is an Escherichia coli.

16. A method for making a fusion polypeptide having one portion comprised of all or part of a
heterologous polypeptide and a second portion comprised of a random sequence [...] sequence portion has
predetermined amino acid constituents, and said method comprises the synthesis of genes encoding said
random polymers, said synthesis comprising the polymerization of small oligonucleotide duplexes, said
method further comprising the cloning of said [...] expression vector adjacent to a DNA sequence
encoding all or part of said heterologous polypeptide and transferring said vector into a bacteria or
eukaryotic cell capable of expressing said fusion polypeptide.

17. A method for synthesizing and identifying a biologically or immunologically active polypeptide, or
mixture of polypeptides, said method comprising the following steps:

(a) synthesizing genes encoding random polypeptides by polymerization of small oligonucleotide
duplexes;

(b) cloning each of said synthetic genes into a vector and transferring said vector into a host capable
[...];

(c) [...] under conditions which permit the formation of recombinant colonies which [...];

(d) [...] a mixture of the recombinant polypeptides, combined either before or after isolation from
said colonies, for evidence of biological and/or immunological activity;

(e) where activity is observed for the mixture of polypeptides, generating smaller subsets of that
combination and testing each of these subsets for biological activity; and

(f) repeating step (e) until the active component(s) or a suitable mixture thereof is obtained.

18. A synthetic gene which codes for rCOP-1-19.

19. A synthetic gene which codes for rCOP-1-77.



Strategy for Synthesizing Random Genes and
Identifying Recombinant Polypeptides with Biological Activity

Figure 1

[Flowchart: mixture of oligonucleotide duplexes; ligate; random sequence synthetic genes; clone in an
expression vector and transform into host; recombinant colonies, each contains one synthetic gene;
identification of biologically active polypeptides: pool colonies (e.g. 1000), isolate polypeptides,
and test for biological activity; fractionate into smaller pools of colonies (e.g. 10 pools of 100
colonies); select individual colonies, isolate the polypeptide, and test for biological activity.]
Figure 2

Synthesis of Genes Encoding Random-Sequence Amino Acid Polymers

[Diagram: oligonucleotides (9-mers), coding and noncoding strands; anneal to form duplexes; ligate to
form synthetic genes; expression yields polypeptides such as
Ala-Ala-Ala-Lys-Lys-Lys-Ala-Ala-Ala-Lys-Lys-Lys-Ala-Ala-Ala and
Ala-Ala-Ala-Ala-Ala-Ala-Lys-Lys-Lys-Lys-Lys-Lys-Lys-Lys-Lys. Key: one duplex symbol encodes Ala Ala Ala
and the other encodes Lys Lys Lys.]

Figure 3 

All Possible 3 Amino Acid Combinations 
and Their Percent Occurrence in Cop 1 

AAA  7.872     AAE  2.624     AAK  6.560     AAY  1.312 
AEA  2.624     AEE  0.875     AEK  2.187     AEY  0.437 
AKA  6.560     AKE  2.187     AKK  5.466     AKY  1.093 
AYA  1.312     AYE  0.437     AYK  1.093     AYY  0.219 
EAA  2.624     EAE  0.875     EAK  2.187     EAY  0.437 
EEA  0.875     EEE  0.292     EEK  0.729     EEY  0.146 
EKA  2.187     EKE  0.729     EKK  1.822     EKY  0.364 
EYA  0.437     EYE  0.146     EYK  0.364     EYY  0.073 
KAA  6.560     KAE  2.187     KAK  5.466     KAY  1.093 
KEA  2.187     KEE  0.729     KEK  1.822     KEY  0.364 
KKA  5.466     KKE  1.822     KKK  4.555     KKY  0.911 
KYA  1.093     KYE  0.364     KYK  0.911     KYY  0.182 
YAA  1.312     YAE  0.437     YAK  1.093     YAY  0.219 
YEA  0.437     YEE  0.146     YEK  0.364     YEY  0.073 
YKA  1.093     YKE  0.364     YKK  0.911     YKY  0.182 
YYA  0.219     YYE  0.073     YYK  0.182     YYY  0.036 

A=Alanine 
E=Glutamic Acid 
K=Lysine 
Y=Tyrosine 
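
The percentages in Figure 3 are consistent with multiplying the mole fractions of the three residues, taking the alanine : glutamic acid : lysine : tyrosine ratio of 6:2:5:1 given in the Background. The following Python sketch reproduces the table under that assumption (each position is treated as an independent draw); the function and variable names are illustrative only.

```python
from itertools import product

# Molar ratio Ala : Glu : Lys : Tyr taken from the Background section.
RATIO = {"A": 6, "E": 2, "K": 5, "Y": 1}


def tripeptide_percentages(ratio):
    """Percent occurrence of every ordered 3-residue combination,
    assuming each position is drawn independently from the molar ratio."""
    total = sum(ratio.values())
    table = {}
    for triple in product(ratio, repeat=3):
        probability = 1.0
        for residue in triple:
            probability *= ratio[residue] / total
        table["".join(triple)] = 100.0 * probability
    return table


table = tripeptide_percentages(RATIO)
for name in ("AAA", "AEK", "KKK", "YYY"):
    print(name, f"{table[name]:.3f}")
# AAA 7.872
# AEK 2.187
# KKK 4.555
# YYY 0.036
```

For example, AAA occurs with probability (6/14)^3, or about 7.872 percent, and the 64 entries sum to 100 percent.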

Figure 4 

9-Nucleotide Duplexes and Adaptors 
for Cop 1 Gene Synthesis 

Examples of 9-Nucleotide Duplexes for Cop 1 Genes: 

Coding (5' end)      AAGAAGGCA    GAAGCAGAA    (3' end) 
Noncoding (3' end)   TTTCTTC...                (5' end) 
Amino acids:         K  K  A      E  A  E 

Adaptors: 

For the 5' end:  GATC (BamH I) 
For the 3' end:  [not legible] 

Figure 5 

Construction and Cloning of 
Synthetic COP 1 Genes 

5' adaptor  GATC- (BamH I)   +   9-mers   +   3' adaptor  -AGCT (Sac I) 
        | 
        |  ligase 
        v 
GATC-[9-mers]-AGCT 
(BamH I)      (Sac I) 
        | 
        |  size-select genes 
        |  clone into vector 
        v 
Synthetic Gene (GATC-...-AGCT) in Plasmid Vector 
cut with BamH I and Sac I 

Figure 6 

Sequence Analysis of a 
Synthetic Gene 

amino acids:  Y  K  K   E  A  E   K  A  K 
nucleotides:  TACAAGAAA|GAAGCAGAA|AAGGCTAAA| 

amino acids:  Y  K  K   Y  K  K   K  K  A 
nucleotides:  TACAAGAAA|TACAAGAAA|AAGAAGGCA| 

amino acids:  E  A  E   K  K  A   E  A  E 
nucleotides:  GAAGCAGAA|AAGAAGGCA|GAAGCAGAA| 

amino acids:  K  K  A   E  A  E   E  A  E 
nucleotides:  AAGAAGGCA|GAAGCAGAA|GAAGCAGAA| 

amino acids:  K  K  A   K  A  K   E  A  E 
nucleotides:  AAGAAGGCA|AAGGCTAAA|GAAGCAGAA| 

amino acids:  K  K  A   K  K  A   K  A  K 
nucleotides:  AAGAAGGCA|AAGAAGGCA|AAGGCTAAA| 

The lines in the nucleotide sequence mark the junctions 
between 9-mer duplexes. 

Amino Acids 
A=alanine 
K=lysine 
E=glutamic acid 
Y=tyrosine 
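
As a cross-check on Figure 6, the short Python sketch below splits a sequenced row at the 9-mer junctions and translates each block with the standard genetic code. The rows are transcribed from the figure on the assumption that each is built from the four 9-mers TACAAGAAA, GAAGCAGAA, AAGGCTAAA and AAGAAGGCA; the helper names are illustrative, and the codon table covers only the codons that occur in those 9-mers.

```python
# Standard-genetic-code assignments for the codons used in the four 9-mers.
CODON_TO_AA = {
    "GCA": "A", "GCT": "A",   # alanine
    "GAA": "E",               # glutamic acid
    "AAA": "K", "AAG": "K",   # lysine
    "TAC": "Y",               # tyrosine
}


def split_blocks(seq, size=9):
    """Cut a sequence into consecutive blocks of `size` nucleotides."""
    if len(seq) % size:
        raise ValueError("sequence length is not a multiple of the block size")
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def translate(seq):
    """Translate a coding sequence codon by codon (one-letter amino acids)."""
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


rows = [
    "TACAAGAAAGAAGCAGAAAAGGCTAAA",
    "TACAAGAAATACAAGAAAAAGAAGGCA",
    "GAAGCAGAAAAGAAGGCAGAAGCAGAA",
]
for row in rows:
    blocks = split_blocks(row)
    print("|".join(blocks), " ".join(translate(block) for block in blocks))
# TACAAGAAA|GAAGCAGAA|AAGGCTAAA YKK EAE KAK
# TACAAGAAA|TACAAGAAA|AAGAAGGCA YKK YKK KKA
# GAAGCAGAA|AAGAAGGCA|GAAGCAGAA EAE KKA EAE
```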

Figure 7 

Cop 1 
Polypeptides 

[Image not reproduced. Molecular weight markers: 43 kD, 29 kD.] 

Figure 8 

Variations of Random-Gene Synthesis 
using Small Duplexes 

One Nucleotide Extension: 

3-mers      GCA      GAA 
           TCG      TCT 

6-mers      GCAAAA      GAAGCA 
           TCGTTT      TCTTCG 

12-mers     GCAGAAGCAAAA 
           TCGTCTTCGTTT 

Three Nucleotide Extensions, 12-mer Duplexes: 

duplex         XXXXXXXXXGCA 
            CGTXXXXXXXXX 

                 | 
                 |  ligate 
                 v 

gene           XXXXXXXXXGCAXXXXXXXXXGCAXXXXXXXXXGCA 
            CGTXXXXXXXXXCGTXXXXXXXXXCGTXXXXXXXXX 

polypeptide    Ala aa aa aa Ala aa aa aa Ala aa aa aa Ala 

XXX = any codon 
aa  = amino acid specified by codon XXX 
Ala = alanine 
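
The staggered duplexes of Figure 8 can be derived mechanically from the coding strand. The Python sketch below is one way to do so, assuming (as the figure suggests) that every coding strand in the mixture ends in the same bases, so that the overhang of one duplex pairs with the 3' end of whichever duplex lies upstream. The function name `staggered_bottom` is hypothetical, and the 12-mer used for the three-nucleotide case is a made-up example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def staggered_bottom(top: str, extension: int = 1) -> str:
    """Noncoding strand as drawn under `top` (written 3'->5'),
    shifted to the left by `extension` nucleotides.

    The left overhang has to pair with the last `extension` bases of the
    upstream duplex; those are assumed to equal the last bases of `top`.
    """
    if not 0 < extension < len(top):
        raise ValueError("extension must be shorter than the oligonucleotide")
    shifted = top[-extension:] + top[:-extension]
    return shifted.translate(COMPLEMENT)


# One-nucleotide extensions shown in Figure 8.
for top in ("GCA", "GAA", "GCAAAA", "GAAGCA", "GCAGAAGCAAAA"):
    print(top, staggered_bottom(top))
# GCA TCG
# GAA TCT
# GCAAAA TCGTTT
# GAAGCA TCTTCG
# GCAGAAGCAAAA TCGTCTTCGTTT

# Three-nucleotide extension: a hypothetical 12-mer ending in the Ala codon GCA.
print(staggered_bottom("AAGAAAGAAGCA", extension=3))  # CGTTTCTTTCTT
```

Because the overhang is a whole codon in the three-nucleotide case, every ligation junction contributes a fixed alanine, which matches the repeating Ala in the polypeptide line of the figure.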

Figure 9 

Synthesis of Single-Stranded DNA Encoding 
Random Sequence Amino Acid Polymers 

3 Nucleotide Segments for COP 1 

Ala    5' GCA 3' 
Glu    5' GAA 3' 
Lys    5' AAA 3' 
Tyr    5' TAC 3' 

ratio: [values not legible] 
        | 
        |  Polymerize 
        |  A. Solution Chemistry 
        |  or B. DNA Synthesizer 
        v 
Single-Stranded DNA    5' ---------- 3' 
        | 
        |  Enzymatic Synthesis 
        |  of the Second Strand 
        v 
Double-Stranded DNA    5' ---------- 3' 
                       3' ---------- 5' 
        | 
        v 
Clone into 
Expression Vector 
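
The scheme of Figure 9 can be simulated in silico as a rough sketch. The trinucleotides are those listed in the figure; the 6:2:5:1 weighting is taken from the molar ratio in the Background because the ratio printed in the figure is not legible, and the function names and the choice of 20 codons are illustrative assumptions. The "second strand" here is simply the reverse complement, standing in for the enzymatic synthesis step.

```python
import random

# Trinucleotide segments from Figure 9 and the residues they encode.
TRINUCLEOTIDES = {"GCA": "A", "GAA": "E", "AAA": "K", "TAC": "Y"}
# Assumed Ala:Glu:Lys:Tyr weighting (molar ratio from the Background).
WEIGHTS = [6, 2, 5, 1]

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def polymerize(n_codons: int, rng: random.Random) -> str:
    """Join n_codons trinucleotides drawn at random in the assumed ratio."""
    codons = rng.choices(list(TRINUCLEOTIDES), weights=WEIGHTS, k=n_codons)
    return "".join(codons)


def second_strand(first_strand: str) -> str:
    """Reverse complement of the first strand, written 5'->3'."""
    return first_strand.translate(COMPLEMENT)[::-1]


def translate(coding: str) -> str:
    """Read the coding strand three nucleotides at a time."""
    return "".join(TRINUCLEOTIDES[coding[i:i + 3]]
                   for i in range(0, len(coding), 3))


rng = random.Random(0)            # fixed seed so the run is repeatable
gene = polymerize(20, rng)
print("first strand :", gene)
print("second strand:", second_strand(gene))
print("polypeptide  :", translate(gene))
```

Each run with a different seed gives a different random-sequence gene, while the long-run residue composition tends toward the chosen ratio.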

Figure 10 

Phosphoramidite Trinucleotide for Random 
Gene Synthesis Using a DNA Synthesizer 

[Chemical structure drawings not reproduced. Labeled structures: Phosphoramidite Trinucleotide; Phosphoramidite Mononucleotide. Legible group labels: DMTr, CN-CH2-CH2-O, CH(CH3)2.] 

FIGURE 11 
rCOP-1-77 



y K K Y K K Y K K E 



j. E y K -K K A -K * a *. 

x^c^cg^^^^ 

- A E A A K A A X A A A A A * * K 
V K ^«»»..r.GeTAAAT>CAAGAAAAAGGCTAAAGAA 

K , X B X A A > A > K A K V * K K A « " 

A E , K K X A X A A A E A E « X K E * » 

QXAGCAGAATACAAGAAATACXAGAAAAAGGCTAAAAXG^CTAAXTACAXG AAAAAGGC^ 

, , , , « X * K X X A-.K XXX** K X A 

TTTCTTCGTCTTTTCCGATTTCGXMXCGTCTTCGTCTTTTCCGXTTTC^ 

XEXEKAKAAAEAEXA 
AAGAAATACAAGAAAGAAGCAGAAAAGGCTAAAGAAGCAGAATAA 
TTCTTTATGTTCTTTCTTCGTCTTTTCCGATTTCTTCGTCTTATT 
K  K  Y  K  K  E  A  E  K  A  K  E  A  E  * 



X A X E . A E ¥ " 

FIGURE 12 
rCOP-1-19 



cgtttccgacgtc ; ; ; « x x k « i *■ . x . * x k . 

A K A A r* gaaAGCAGAGAAAGCAGAGAAAGCAGCTGCTGAA 



CGATTTATGCTTCGTTTCTTCCGTC, ^ ^ x £ 

A K Y E A K .^r.AGAAAGCAAAGGAA 



GTTTCTTCCGTCTCT1 
K A K E A K K A E K - ^~* r AGAAAGCAAAG 



TTCTTCCGTTTCCTTCGTTT^.-^ ^ 

KKA EKXKEAEKA 



K fcr AAGGCAAAGGCTGCAG AG AAAGCAGAG AAAGCAAAta 



cgtctctttcgtttccttcgtttcttccgtttccgj _ m fi xNa k 

^ E A K .K 



. A K A A E K A E 




TTCCGTCTCTTTCGTTTCCGA 

K A E K A 
GCGAAAGCAAAGGCTGCATAA 
CGCTTTCGTTTCCGACGTATT 



K A A E K A K A A Y K K 



